Introduction
Current Market Practice
App Overview
Scraping the Data
| 1. BTO Data
| 2. HDB Resale Data
| 3. Condominium Resale Data
| 4. MRT Data
| 5. Schools Data
| 6. Malls Data
| 7. Nature Parks Data
| 8. Clinics Data
| 9. Preparing Data for Predictive Analysis
App Functions
| 1. Home Page
| 2. Amenities Tab
| 3. Comparison Tab
| 4. Prediction Model & Analysis Tab
| Findings
Further Development
Conclusion
This application was developed to help prospective Build-To-Order (BTO) flat buyers. Buyers weigh many factors when choosing a BTO flat: nearby facilities (transport: MRT stations; education: schools; medical: clinics; entertainment: malls), the future resale value of the flat, the historical prices of other flats in the area, and how the flat compares with condominiums and resale flats. The “BTO Analysis Buddy” was therefore created as an all-in-one solution that guides the user step by step towards a flat that meets their needs.
Currently, buyers have to use Google and visit various websites to gather their own information. Examples include Seedly’s BTO guide, MoneySmart’s BTO guide, “How I Bought My HDB Resale Flat” by TheSmartLocal and many more. However, visiting each page is time-consuming and tedious, and because every page presents different information in a different way, it is hard to synthesise it all. Shortlisting a suitable BTO is difficult, and even after shortlisting, one still has to look up each address in Google Maps to view the nearby amenities. This process can take weeks and could be made far more efficient.
There is also currently no website or app that integrates BTO listings with HDB resale flats or condominiums, making it difficult for potential buyers to compare them. Although websites such as PropertyGuru and SRX list HDB resale flats, they lack the convenience that our app provides, such as easy comparison and filtering by multiple criteria.
The “BTO Analysis Buddy” application has 4 tabs, each with its own functionality. Although users can use each tab independently, the most common user journey is as follows:
Home Page: Users can filter for housing type to view BTO, HDB resale or condominiums, as well as filter for nearby amenities. The map tiles will also show the estimated pricing range. After filtering, one can click on the marker to view details and add the listing to a comparison table.
Comparison: This tab allows one to easily compare shortlisted flats.
Analysis: This tab gives more information about resale value forecasts and the factors that affect the future price, which can be an important input to the decision-making process.
Amenities: After deciding on or shortlisting flats, this tab shows an in-depth view of the exact amenities near the selected flat.
The BTO data was obtained from the HDB website. To the files containing BTO, HDB resale and condominium resale data, we added columns holding the minimum distance from each property to each of 5 types of amenity (MRT stations, parks, malls, schools and clinics). This was done with a nested for loop that iterates over every property in each file and over every amenity. The minimum distances were then appended to the files for the respective housing types.
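The nested loop described above can be sketched as follows. This is a minimal Python illustration, not the project's own code: the record layout, the `dist_<type>_km` column names, the sample coordinates and the use of the haversine formula are all assumptions, since the source does not say how distance was measured.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def add_min_distances(homes, amenities_by_type):
    """Return copies of the home records with one min-distance column per amenity type."""
    out = []
    for home in homes:  # outer loop: every housing record
        row = dict(home)
        for kind, places in amenities_by_type.items():
            # inner loop: every amenity of this type, keeping only the closest one
            row[f"dist_{kind}_km"] = min(
                haversine_km(home["lat"], home["lon"], p["lat"], p["lon"])
                for p in places
            )
        out.append(row)
    return out

# Hypothetical coordinates, for illustration only.
homes = [{"name": "Home A", "lat": 1.3000, "lon": 103.8000}]
amenities = {
    "mrt": [{"lat": 1.3000, "lon": 103.8000}, {"lat": 1.3500, "lon": 103.8500}],
    "mall": [{"lat": 1.3100, "lon": 103.8000}],
}
result = add_min_distances(homes, amenities)
```

The cost grows with the number of properties multiplied by the number of amenities, which is acceptable for a one-off preprocessing step on datasets of this size.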
The HDB resale data was scraped from the SRX website using the code in the CrawlSRX.Rmd file. As part of that code, the longitude and latitude were obtained by passing the postal code to a OneMap.sg API.
The condominium resale data was scraped from the SRX website using the code in the CrawlSRX.Rmd file. As part of that code, the longitude and latitude were obtained by passing the postal code to a OneMap.sg API.
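As a rough illustration of the postal-code lookup, the Python sketch below builds a search URL and reads the coordinates out of a response. The endpoint URL, the query parameters and the `results`/`LATITUDE`/`LONGITUDE` field names are assumptions about the OneMap search API and should be checked against its current documentation; the sample response is made up, and no network call is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current OneMap documentation before use.
ONEMAP_SEARCH_URL = "https://www.onemap.gov.sg/api/common/elastic/search"

def build_search_url(postal_code):
    """Build the query URL for a postal-code search (parameter names are assumed)."""
    params = {
        "searchVal": postal_code,
        "returnGeom": "Y",
        "getAddrDetails": "Y",
        "pageNum": 1,
    }
    return f"{ONEMAP_SEARCH_URL}?{urlencode(params)}"

def extract_lat_lon(response):
    """Return (lat, lon) from the first search result, or None if nothing matched."""
    results = response.get("results", [])
    if not results:
        return None
    first = results[0]
    return float(first["LATITUDE"]), float(first["LONGITUDE"])

# Made-up response in the assumed shape, for illustration only.
sample = {
    "found": 1,
    "results": [{"POSTAL": "000000", "LATITUDE": "1.3000", "LONGITUDE": "103.8000"}],
}
coords = extract_lat_lon(sample)
```

Returning `None` for an empty result makes it easy to collect the postal codes that failed to geocode and review them separately.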
The MRT data was taken from data.gov.sg and Kaggle. The files were then merged to obtain the address, station name, station and coordinates of the MRT stations.
…
As no datasets of primary and secondary schools were readily available online, we used the “geocode” function from the “ggmap” package to obtain the coordinates of primary and secondary schools in Singapore. However, as some of the latitude and longitude values returned by Google Maps were wrong, we cleaned the data and corrected those entries manually.
The malls data was scraped from the malls website to get all the shopping malls in Singapore listed there. Using geocode, we then obtained each mall's latitude and longitude from its address.
…
The raw data on the names and locations of the nature parks was obtained from data.gov.sg as a KML file. We converted the KML file into a dataframe and extracted the necessary fields. Some data cleaning was needed, as the original data was poorly formatted once converted into a dataframe.
There are many CHAS clinics around Singapore and they are easily accessible, so most Singaporeans are likely to visit them when they fall sick. The CHAS clinics KML file was taken from data.gov.sg and we extracted the data from it. No data cleaning was needed for this file, as all the information was already accurate.
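Converting a KML file into tabular rows can be sketched in Python as below. This is only an illustration under simplifying assumptions: each `Placemark` is taken to carry a `name` and a `Point` with `lon,lat[,alt]` coordinates, and the sample document is made up. Real data.gov.sg files may store attributes differently, which is the kind of irregularity that calls for cleaning.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

def parse_kml_points(kml_text):
    """Extract name, latitude and longitude from every Placemark that has a Point."""
    root = ET.fromstring(kml_text)
    rows = []
    for pm in root.iterfind(".//kml:Placemark", KML_NS):
        name = pm.findtext("kml:name", default="", namespaces=KML_NS).strip()
        coords = pm.findtext(
            ".//kml:Point/kml:coordinates", default="", namespaces=KML_NS
        ).strip()
        if not coords:
            continue  # skip placemarks without point geometry
        lon, lat = coords.split(",")[:2]  # KML order is longitude, latitude
        rows.append({"name": name, "lat": float(lat), "lon": float(lon)})
    return rows

# Made-up KML document, for illustration only.
sample_kml = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example Park</name>
      <Point><coordinates>103.8000,1.3000,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>No Geometry</name>
    </Placemark>
  </Document>
</kml>"""
parks = parse_kml_points(sample_kml)
```

Note that KML lists longitude before latitude, which is a common source of swapped coordinates.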
From the cleaned resale data, categorical variables were split into separate columns through one-hot encoding. The closest distance to each type of amenity (malls, MRT stations, parks, clinics and schools) was added, as these may increase the accuracy of the prediction model. The age of each flat was computed from the year in which it was built. The number of bus stops around each flat was also added.
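The two feature-engineering steps above, one-hot encoding and computing age, can be sketched in plain Python. The column names (`flat_type`, `year_built`), the reference year and the sample records are hypothetical; the project's actual columns are not given in the source.

```python
def add_age(records, current_year):
    """Add an 'age' column computed from the year each flat was built."""
    return [{**r, "age": current_year - r["year_built"]} for r in records]

def one_hot(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(row)
    return encoded

# Hypothetical records, for illustration only.
flats = [
    {"flat_type": "3 ROOM", "year_built": 1990},
    {"flat_type": "4 ROOM", "year_built": 2010},
]
prepared = one_hot(add_age(flats, current_year=2020), "flat_type")
```

When the encoded columns feed a linear regression with an intercept, one category per variable is usually dropped to avoid perfect collinearity.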
…
…
…
Our home page shows an interactive map that lets users filter by the type of property they want (BTO flats, HDB resale flats or condominiums). For BTOs, we show all the flats that were released by HDB in 2020. For HDB resale flats and condominiums, the map shows all available units listed for sale on SRX.
We understand that different buyers have different requirements. Some would like flats near MRT stations while couples with cars may not be too concerned about the distance of their flat from the MRT station. These couples would prioritise other facilities such as shopping centres or parks. We hence created an additional filter that allows users to select the facilities they want, as well as the maximum distance they are willing to accept. The map will then update to show only units that are within the selected distance for those facilities chosen.
For example, suppose a user is not interested in MRT stations, and would like to search for HDB resale flats that are within 1km of malls, parks, clinics and schools. The intended result is shown below.
Fig.1: HDB Resale Flats That Are Within 1km of Malls, Parks, Clinics, and Schools
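The filter logic behind this example can be sketched in Python as follows. It assumes each listing already carries precomputed minimum-distance columns named `dist_<type>_km`; those names and the sample listings are hypothetical. An amenity the user did not select is simply left out of the limits, so it does not constrain the result.

```python
def filter_listings(listings, max_km_by_amenity):
    """Keep listings within the maximum distance for every selected amenity."""
    return [
        item
        for item in listings
        if all(
            item[f"dist_{kind}_km"] <= limit
            for kind, limit in max_km_by_amenity.items()
        )
    ]

# Hypothetical listings, for illustration only.
listings = [
    {"id": "A", "dist_mrt_km": 2.5, "dist_mall_km": 0.4, "dist_park_km": 0.9,
     "dist_clinic_km": 0.2, "dist_school_km": 0.7},
    {"id": "B", "dist_mrt_km": 0.3, "dist_mall_km": 1.6, "dist_park_km": 0.5,
     "dist_clinic_km": 0.1, "dist_school_km": 0.4},
]

# The user ignores MRT stations and wants everything else within 1 km.
selected = filter_listings(
    listings, {"mall": 1.0, "park": 1.0, "clinic": 1.0, "school": 1.0}
)
```

Listing A passes despite being far from an MRT station, while listing B is dropped because its nearest mall is beyond the limit.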
One problem we identified with the map display is that it shows one unit per marker, so multiple units at the same location overlap and their details cannot be selected. This matters because several units in the same block are often on sale at once. For example, a BTO block can contain 3-Room, 4-Room and 5-Room flats, and several flats in a resale block or condominium can be listed together, yet each case shows up as a single marker because the units share the same longitude and latitude. To overcome this, we added a ‘Select Address’ button to each pop-up, which users can click to show more information. It lists all the units on sale in the block, grouped by characteristics such as number of bedrooms, number of bathrooms and asking price. The grouping avoids listing multiple rows of units with exactly the same characteristics, which would clutter the screen. For further convenience, the display table can be moved around, and it turns semi-transparent when the mouse is outside it. These features give users the flexibility to customise the display to their liking.
Fig.2: “Select Address” Button and Display Table
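The grouping behind the display table can be sketched in Python as follows: units at the selected address that share the same characteristics collapse into one row with a count. The field names and sample units are hypothetical.

```python
from collections import Counter

def group_units(units, address):
    """Collapse identical units at one address into rows with a unit count."""
    keys = ("bedrooms", "bathrooms", "asking_price")
    counts = Counter(
        tuple(u[k] for k in keys) for u in units if u["address"] == address
    )
    return [dict(zip(keys, combo), units=n) for combo, n in sorted(counts.items())]

# Hypothetical listings, for illustration only.
units = [
    {"address": "Block 1", "bedrooms": 3, "bathrooms": 2, "asking_price": 500000},
    {"address": "Block 1", "bedrooms": 3, "bathrooms": 2, "asking_price": 500000},
    {"address": "Block 1", "bedrooms": 4, "bathrooms": 2, "asking_price": 650000},
    {"address": "Block 2", "bedrooms": 3, "bathrooms": 2, "asking_price": 480000},
]
grouped = group_units(units, "Block 1")
```

Here the two identical 3-bedroom units become a single row with a count of 2, so the table stays short even for large blocks.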
We understand that the amount of data may be overwhelming, and that users may want to narrow it down to a few selected units for more in-depth analysis. We therefore added an ‘Add to Comparison’ button to each pop-up, which lets users add units of interest to a comparison table for further analysis. This integrates with the Comparison tab for a seamless experience.
To add further value for users, we also show the price differences between towns for HDB resale flats using a colour gradient. Users can see at a glance which towns are more expensive and which are cheaper, and can narrow their search to specific areas based on their budget. This feature can be enabled or disabled according to the user’s preference. From the picture below, we can see that HDB resale flats in the north (e.g. Sembawang, Yishun) tend to be a lot cheaper than HDB resale flats in the south (e.g. Bukit Merah, Queenstown).
Fig.3: Price Range of Resale Flats
Finally, our app aims to provide as much customisation as possible. Besides the features already mentioned (such as movable comparison and amenities filter tables, and the option to disable the price range gradient), users can also choose between different map types to suit their preferences.
Since most BTO buyers are couples who are likely to be first-time buyers, we created an additional “Amenities” tab to give them a clearer picture of the amenities near the BTO they plan to bid for. Buyers can filter for MRT stations, parks, schools, clinics and malls. These are especially relevant for married or soon-to-be-married couples, who are likely to stay at the location as their family grows. This page therefore helps them identify BTO flats that have the specific amenities they are looking for nearby.
Fig.4: Amenities Tab
The Comparison tab serves as a quick reference for the user to compare different properties, showing their metrics side-by-side. The metrics that are shown are the Property Name, the Category (Resale or BTO), Neighbourhood (Area in Singapore), Address, Property Type, Postal Code, Number of Bathrooms, Number of Bedrooms, Size (in square metres), Predicted Value, Predicted Value in 5 Years, and Current Price. Only BTO and Resale HDB properties are included, as our app primarily caters to first-time buyers, and they are not likely to compare HDB flats with condominiums.
Fig.5: Comparison Tab
In this interface, users can select the postal code of a resale HDB flat from the dropdown menu, or type it in if they prefer. When the “Add Address” button is clicked, a row is added to the table on the right (or several rows, if the building contains several flat types such as 2-Room, 3-Room and 4-Room). A row can also be added by pressing the “Add to Comparison” button on the Home Page tab.
The “Clear All” button clears all the rows. To remove specific rows, the user can either enter the rows to delete or click on a row, and then press the “Delete Row” button.
This allows more flexibility and customisability in comparing different properties. By listing the properties together, we hope the user can see all of them at a glance, which makes the process of finding a new home more convenient.
Most buyers consider future resale value an important criterion when buying a BTO or resale flat. Hence, the table in the Comparison tab also includes the predicted value today and the predicted value 5 years later. The predicted value today matters to buyers considering a resale HDB flat, who can compare it with the asking price to judge whether the flat is a good buy. The 5-year predicted value matters to BTO buyers who, as an investment strategy, want to estimate the resale value after the Minimum Occupation Period (MOP, 5 years).
After cleaning the data for prediction (see the previous section) and running the linear regression model, variables that are not statistically significant at the 5% level (p-value above 0.05) are removed. The exception is the number of bus stops, which the team retained because this factor could later be engineered to include the distance to those bus stops, which may improve its predictive power. The model has an adjusted R-squared of 0.8197, meaning it explains about 82% of the variance in resale prices, which makes it fairly accurate at predicting actual values.
Fig.6: Linear Regression Model
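The variable selection step and the adjusted R-squared figure can be illustrated with the short Python sketch below. The variable names and p-values are made up and are not taken from the fitted model; only the standard adjusted R-squared formula and the 5% threshold come from the description above.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: penalises R-squared for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

def keep_significant(p_values, alpha=0.05, always_keep=()):
    """Keep variables with p-value below alpha, plus any that are force-kept."""
    return [
        name for name, p in p_values.items() if p < alpha or name in always_keep
    ]

# Made-up p-values, for illustration only.
p_values = {"size_sqm": 0.001, "age": 0.003, "dist_park_km": 0.40, "bus_stops": 0.20}
kept = keep_significant(p_values, always_keep=("bus_stops",))
```

In this example `dist_park_km` is dropped, while `bus_stops` is retained despite its p-value because it is explicitly force-kept, mirroring the team's decision.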
For the 5-year predicted value, the predicted value today is adjusted by a discount rate based on the flat's age 5 years later. The team uses Bala's Table (a valuation table giving leasehold value as a percentage of freehold value: https://www.99.co/blog/singapore/wp-content/uploads/2020/04/GCDjTNq.png) to estimate the discount rate, and adds it to the expected inflation rate of 1.459% (https://www.ceicdata.com/en/indicator/singapore/forecast-consumer-price-index-growth).
To help users compare HDB values against the underlying variables, a visualisation tab was created. The tab lets users compare variables such as size, floor level, flat type (e.g. 3-Room), age and distance from amenities (e.g. MRT) against the predicted and market values today. Users can also compare prices by region (e.g. Ang Mo Kio) based on predicted value and market/historical value.
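The source does not give the exact formula, so the Python sketch below shows only one possible reading of this adjustment: scale today's predicted value by the ratio of the leasehold percentages at the future and current remaining lease, then compound the expected inflation rate over 5 years. The function name, its parameters and the percentages used in the example are hypothetical placeholders and are not values from Bala's Table.

```python
def value_in_five_years(value_today, pct_now, pct_in_5y, inflation=0.01459, years=5):
    """Estimate a future value from lease decay and compounded inflation.

    pct_now / pct_in_5y: leasehold value as a percentage of freehold value at
    the current and future remaining lease (to be looked up in a valuation table).
    """
    lease_factor = pct_in_5y / pct_now          # depreciation from a shorter lease
    inflation_factor = (1 + inflation) ** years  # general price growth
    return value_today * lease_factor * inflation_factor

# Placeholder percentages, NOT taken from Bala's Table.
estimate = value_in_five_years(500_000, pct_now=90.0, pct_in_5y=88.0)
```

Under this reading, the two effects pull in opposite directions: lease decay lowers the value while inflation raises it, so the net change depends on how old the flat is.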
Fig.7: Region Price
Predicted and market values differ widely when compared against a single variable (e.g. see the size comparison below). This is because market values are confounded by other variables, whereas predicted values hold the other variables constant (using the model coefficients). The predicted values are therefore a better reflection of how the selected variable alone affects the value of the house.
Fig.8: Size of House
The blue points represent resale prices by size of HDB flat, and the black line represents the average value today according to the prediction model. Users can use this to gauge how much they should be paying for a given size, keeping all other factors constant. If a user finds a house priced close to or below the predicted line, it is likely to be a good buy.
Fig.9
From this chart, users can see that floor level does affect housing prices, but its effect is small compared with flat type (3-Room, 4-Room, etc.). This could be because flat type is related to the size of the house. Users can also estimate how much more they should pay for a high floor compared with a low floor, or for different flat types.
Fig.10
We can see that flat prices decrease as the age of the house increases. Moreover, older houses (30 years and above) tend to have a wider price range. The orange line represents the average price according to the prediction model. This serves 2 purposes. First, users can see how much the value of their flat is likely to decrease, on average, over the years. Second, if they are buying a resale flat, they can see how much they should be paying given the age of the house.
Fig.11
This chart shows how distance from amenities affects housing prices, and by how much. Users can see that prices increase as houses get further away from schools. One possible explanation is that houses further from schools are less noisy. As the distance from MRT stations and malls increases, the price decreases. Distance from the MRT has a much greater impact on prices, which suggests that buyers want houses close to an MRT station. Users should therefore take distance from the MRT into account when buying a house, as it has a significant impact on price.
Fig.12
This shows the average price of HDB houses by region (e.g. Ang Mo Kio); the user can choose between historical prices and model-based prices. The darker the colour, the more expensive the region. From this chart, users can see that southern regions (Bukit Timah, Queenstown) have higher prices.
From the analysis tab, we can see that the price range of houses is very wide for the same value of a given factor. For example, for a house size of 95 sqm, the difference between the minimum and maximum price is around $800,000. This is also evident in the age analysis, where minimum and maximum prices differ significantly for the same age. No single factor determines the price of a house, as there are numerous confounding factors. Users should therefore consider several factors when estimating the price of a house, and take care that they are paying a fair value in the resale market.
Currently, this application is catered towards first-time flat buyers who are more interested in HDB flats. For future development, the team can look into developing different versions of the application, for instance catered to condominium buyers. If a condominium tool version was developed, the future pricing analysis would differ, and users may also have different requirements - for instance, they may want a comparison with landed properties. Future development of this application can include more facilities such as stadiums, religious sites, hawker centres and carparks to provide a more comprehensive overview for users. Upcoming MRTs and amenities can also be included, which would help users to make better decisions, and make future price analysis even more insightful. This is especially so for non-mature estates such as Tengah, where many new amenities are scheduled to be opened in the next few years.
The BTO Analysis Buddy brings convenience to users. It is designed to be aesthetically pleasing and user-friendly and, most importantly, it combines filtering, comparison and predictive features that together make the BTO buying process a breeze.